AITopics | computational error

Collaborating Authors

computational error

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Evaluating Zero-Shot Long-Context LLM Compression

Wang, Chenyu, Wang, Yihan

arXiv.org Artificial IntelligenceJun-10-2024

This study evaluates the effectiveness of zero-shot compression techniques on large language models (LLMs) under long-context. We identify the tendency for computational errors to increase under long-context when employing certain compression methods. We propose a hypothesis to explain the varied behavior of different LLM compression techniques and explore remedies to mitigate the performance decline observed in some techniques under long-context. This is a course report for COS 598D Machine Learning and Systems by Prof. Kai Li at Princeton University. Due to limited computational resources, our experiments were conducted only on LLaMA-2-7B-32K.

arxiv preprint arxiv, context length, quantization, (12 more...)

arXiv.org Artificial Intelligence

2406.06773

Country: Europe (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Scalable nonconvex inexact proximalsplitting

Neural Information Processing SystemsMar-14-2024, 11:01:19 GMT

We study a class of large-scale, nonsmooth, and nonconvex optimization problems. In particular, we focus on nonconvex problems with composite objectives. This class includes the extensively studied class of convex composite objective problems as a subclass. To solve composite nonconvex problems we introduce a powerful new framework based on asymptotically nonvanishing errors, avoiding the common stronger assumption of vanishing errors. Within our new framework we derive both batch and incremental proximal splitting algorithms. To our knowledge, our work is first to develop and analyze incremental nonconvex proximalsplitting algorithms, even if we were to disregard the ability to handle nonvanishing errors. We illustrate one instance of our general framework by showing an application to large-scale nonsmooth matrix factorization.

algorithm, computational error, nip, (15 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

MaScQA: A Question Answering Dataset for Investigating Materials Science Knowledge of Large Language Models

Zaki, Mohd, Jayadeva, null, Mausam, null, Krishnan, N. M. Anoop

arXiv.org Artificial IntelligenceAug-17-2023

Department of Computer Science & Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India Abstract Information extraction and textual comprehension from materials literature are vital for developing an exhaustive knowledge base that enables accelerated materials discovery. Language models have demonstrated their capability to answer domain-specific questions and retrieve information from knowledge bases. However, there are no benchmark datasets in the materials domain that can evaluate the understanding of the key concepts by these language models. In this work, we curate a dataset of 650 challenging questions from the materials domain that require the knowledge and skills of a materials student who has cleared their undergraduate degree. We classify these questions based on their structure and the materials science domain-based subcategories. Further, we evaluate the performance of GPT-3.5 and GPT-4 models on solving these questions via zero-shot and chain of thought prompting. It is observed that GPT-4 gives the best performance (~62% accuracy) as compared to GPT-3.5. Interestingly, in contrast to the general observation, no significant improvement in accuracy is observed with the chain of thought prompting. To evaluate the limitations, we performed an error analysis, which revealed conceptual errors (~64%) as the major contributor compared to computational errors (~36%) towards the reduced performance of LLMs. We hope that the dataset and analysis performed in this work will promote further research in developing better materials science domain-specific LLMs and strategies for information extraction.

category, dataset, gpt-4-cot, (14 more...)

arXiv.org Artificial Intelligence

2308.09115

Country:

Asia > India > NCT > New Delhi (0.25)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.54)
Research Report > Experimental Study > Negative Result (0.54)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Bit-Parallel Deterministic Stochastic Multiplier

Vatsavai, Sairam Sri, Thakkar, Ishan

arXiv.org Artificial IntelligenceFeb-14-2023

Abstract--This paper presents a novel bit-parallel deterministic stochastic multiplier, which improves the area-energy-latency product by up to 10.6 10 In SC's unipolar format, W is an SB of N bits GEMM circuits used in deep learning accelerators [2]. Stochastic multipliers suffer from computational errors. Our proposed reduced lengths and deterministic bit-position correlations in multiplier achieves 32.2%, 42.8%, and 51.8% lower MAE a bit-parallel manner, thereby simultaneously minimizing the compared to uMUL [2], Jenson [3], and Gaines [1], respectively. In addition, our proposed multiplier achieves 10.6 10 Our multiplier achieves that as follows.

artificial intelligence, machine learning, multiplier, (11 more...)

arXiv.org Artificial Intelligence

2302.08324

Country: North America > United States > Kentucky > Fayette County > Lexington (0.16)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.36)

Add feedback

Scalable nonconvex inexact proximal splitting

Sra, Suvrit

Neural Information Processing SystemsDec-31-2012

We study large-scale, nonsmooth, nonconconvex optimization problems. In particular, we focus on nonconvex problems with \emph{composite} objectives. This class of problems includes the extensively studied convex, composite objective problems as a special case. To tackle composite nonconvex problems, we introduce a powerful new framework based on asymptotically \emph{nonvanishing} errors, avoiding the common convenient assumption of eventually vanishing errors. Within our framework we derive both batch and incremental nonconvex proximal splitting algorithms. To our knowledge, our framework is first to develop and analyze incremental \emph{nonconvex} proximal-splitting algorithms, even if we disregard the ability to handle nonvanishing errors. We illustrate our theoretical framework by showing how it applies to difficult large-scale, nonsmooth, and nonconvex problems.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Nonconvex proximal splitting: batch and incremental algorithms

Sra, Suvrit

arXiv.org Machine LearningSep-17-2012

Within the unmanageably large class of nonconvex optimization, we consider the rich subclass of nonsmooth problems that have composite objectives---this already includes the extensively studied convex, composite objective problems as a special case. For this subclass, we introduce a powerful, new framework that permits asymptotically non-vanishing perturbations. In particular, we develop perturbation-based batch and incremental (online like) nonconvex proximal splitting algorithms. To our knowledge, this is the first time that such perturbation-based nonconvex splitting algorithms are being proposed and analyzed. While the main contribution of the paper is the theoretical framework, we complement our results by presenting some empirical results on matrix factorization.

artificial intelligence, machine learning, proximal, (16 more...)

arXiv.org Machine Learning

1109.0258

Country: Europe > Germany (0.04)

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback